Skip to content

Support duplicate bag in insert and lookup #1718

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

jgonet
Copy link
Contributor

@jgonet jgonet commented Jun 17, 2025

Merge after #1614. New code after "ets hash table: rename status enum".

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

Comment on lines +3341 to +3618
switch (key_atom) {
case NAMED_TABLE_ATOM: {
is_named = true;
break;
}
case PRIVATE_ATOM: {
private = true;
public = false;
break;
}
case PUBLIC_ATOM: {
private = false;
public = true;
break;
}
case SET_ATOM: {
is_duplicate_bag = false;
break;
}
case DUPLICATE_BAG_ATOM: {
is_duplicate_bag = true;
break;
}
case KEYPOS_ATOM: {
VALIDATE_VALUE(value, term_is_integer);
avm_int_t keypos = term_to_int(value);
if (UNLIKELY(keypos < 1)) {
RAISE_ERROR(BADARG_ATOM);
}
key_index = keypos - 1;
break;
}
default:
RAISE_ERROR(BADARG_ATOM);
}
options = term_get_list_tail(options);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about adding interop_parse_kw_opts(term kw, &opts) but this loop relies on ability to override keys when they're repeated (e.g. [set, duplicate_bag, set]). Maybe in the future.

}

// TODO: create list elsewhere, by copying terms from orig heap, appending new copied tuple and using ets_hashtable_new_node
struct HNode *ets_hashtable_new_node_from_list(term old_tuples, term tuple, size_t key_index)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This copies old tuple list from previous node's heap, copies tuple from process' heap and merges them together on the new heap. TODO is for removing this function and using utility function to do that and then just pass new heap to new_node.

Copy link
Collaborator

@bettio bettio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm adding a first round of comments here, I did not yet finished this review, so more comments will follow.
A rebase on latest main would defintely help, since the previous ets PR has been merged.

@@ -3297,35 +3297,96 @@ static term nif_erlang_raise(Context *ctx, int argc, term argv[])
return term_invalid_term();
}

static inline term get_option_key(term option, term *maybe_value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it took me a moment to find the purpose of this function, so I suggest naming it extract_option_key_value().

Also, if you plan to make something generic (while keeping a clean and simple API) that can be used elsewhere as well, returning true as value for atoms might be a good choice (while using undefined when no property is found).

VALIDATE_VALUE(key_atom, term_is_atom);

switch (key_atom) {
case NAMED_TABLE_ATOM: {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fun fact: on OTP it is not a proplist, so ets:new(foo, [{named_table, true}]). raises badarg, but ets:new(fiz, [named_table]) doesn't.
So I think this can be handled with a simple change:

switch (head) {
   case NAMED_TABLE_ATOM:
      ...
   default:
       VALIDATE_VALUE(head, term_is_tuple);
       term value;
       term key_atom = get_option_key(head, &value);
       VALIDATE_VALUE(key_atom, term_is_atom);

ets:new(foo, [{named_table, true}]). also, let's make sure in tests that fails as expected.

jgonet added 17 commits July 8, 2025 21:46
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
ETS mixes position (1-based) with indexing (0-based).
Tuple are 0-indexed so we should convert to it at earliest possible moment.

This commit doesn't introduce any change in behavior.

Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
ETS supports setting different key position (default: first element in tuple) in ets:new/2.
Checking if updated element shouldn't be placed as first element is wrong.

Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Additionally, unifies tuple arity checks (no change in behavior).

Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Signed-off-by: Jakub Gonet <jakub.gonet@swmansion.com>
Copy link
Collaborator

@bettio bettio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some minor comments, I think this PR is already quite good.

Let's remember to update this in CHANGELOG, so we can let everyone know that we have support for additional options: there should be already an entry for ETS, there is no need to add a new one since that feature has not been released yet, let's just make sure that the ets line is updated.

Important: it looks like "test_ets:FAILED" is failing on the BEAM, please check the error and let's make sure that tests behave the same way on the BEAM and AtomVM.

ok = assert_badarg(fun() -> ets:update_counter(Tid, foo, 10) end),
% ok = assert_badarg(fun() -> ets:update_element(Tid, foo, {1, error}) end),

% true = ets:delete_object(Tid, {bad, bad}),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case we are leaving some commented out tests, let's write a comment why.

{
assert(term_is_tuple(tuple));
assert(term_get_tuple_arity(tuple) >= key_index);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add an empty line after this, in order to separate the logical block with assertions from the rest of the function.

if ((size_t) term_get_tuple_arity(entry) < keypos + 1) {
while (term_is_nonempty_list(list)) {
term tuple = term_get_list_head(list);
nodes[last_i] = tuple_to_insertion_node(ets_table, tuple, global);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a check here, but I'd love a confirm from your side: in case of OOM, do we remain in a consistent state? I want to be sure we don't leave half inserted stuff or any other incosistency.

} else if (is_duplicate_bag) {
assert(term_is_list(ets_entry));
// for tuple list and it reversed version - we don't want to copy terms in the loop
int _proper;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have an UNUSED(x) macro, let's use it.
Also, in C, identifiers starting with a single underscore are reserved.

if (IS_NULL_PTR(ets_table)) {
return EtsBadAccess;
}

bool _found = ets_hashtable_remove(ets_table->hashtable, key, ets_table->keypos, ctx->global);
bool _found = ets_hashtable_remove(ets_table->hashtable, key, ctx->global);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Se previous comment about underscore. We can even ignore and discard return value. No need to assign it.

@@ -1704,7 +1703,6 @@ static inline term term_get_tuple_element(term t, int elem_index)
const term *boxed_value = term_to_const_term_ptr(t);

TERM_DEBUG_ASSERT((size_t) elem_index < term_get_size_from_boxed_header(boxed_value[0]));

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This empty line can be kept.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants